Mining Biclusters of Similar Values with Triadic Concept Analysis

Authors

  • Mehdi Kaytoue-Uberall
  • Sergei O. Kuznetsov
  • Juraj Macko
  • Wagner Meira
  • Amedeo Napoli
Abstract

Biclustering numerical data became a popular data-mining task in the early 2000s, especially for analysing gene expression data. A bicluster reflects a strong association between a subset of objects and a subset of attributes in a numerical object/attribute data table. So-called biclusters of similar values can be thought of as maximal sub-tables with close values. Only a few methods address a complete, correct and non-redundant enumeration of such patterns, which is a well-known intractable problem, and no formal framework exists. In this paper, we introduce important links between biclustering and formal concept analysis. More specifically, we show for the first time that Triadic Concept Analysis (TCA) provides a suitable mathematical framework for biclustering. Interestingly, existing TCA algorithms, which usually apply to binary data, can be used (directly or with slight modifications) after a preprocessing step to extract maximal biclusters of similar values.
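To make the notion of "maximal sub-tables with close values" concrete, the following is a minimal brute-force sketch and not the TCA-based method of the paper. It assumes that "close" means that the largest and smallest values in the sub-table differ by at most a threshold `theta`. The names `is_similar`, `maximal_biclusters` and `theta` are hypothetical and introduced only for illustration. The enumeration is exponential in the table size, consistent with the intractability noted in the abstract, so it is only usable on toy data.

```python
from itertools import combinations


def is_similar(table, rows, cols, theta):
    """Return True if all values of the sub-table (rows x cols) lie within theta.

    Assumption: 'close values' means max - min <= theta over the sub-table.
    """
    values = [table[r][c] for r in rows for c in cols]
    return max(values) - min(values) <= theta


def nonempty_subsets(n):
    """All non-empty subsets of range(n), as frozensets."""
    return [frozenset(s)
            for k in range(1, n + 1)
            for s in combinations(range(n), k)]


def maximal_biclusters(table, theta):
    """Brute-force enumeration of maximal biclusters of similar values.

    A bicluster (rows, cols) is kept only if no other similar bicluster
    contains it (component-wise set inclusion).
    """
    row_sets = nonempty_subsets(len(table))
    col_sets = nonempty_subsets(len(table[0]))
    similar = [(r, c) for r in row_sets for c in col_sets
               if is_similar(table, r, c, theta)]
    return [(r, c) for (r, c) in similar
            if not any((r2, c2) != (r, c) and r <= r2 and c <= c2
                       for (r2, c2) in similar)]


if __name__ == "__main__":
    # Toy 3x3 object/attribute table (made-up values, for illustration only).
    toy = [[1, 2, 9],
           [2, 1, 9],
           [8, 9, 9]]
    for rows, cols in maximal_biclusters(toy, theta=1):
        print(sorted(rows), sorted(cols))
```

On the toy table with `theta=1`, the sketch keeps three sub-tables: rows 0-1 on columns 0-1, all rows on column 2, and row 2 on all columns. Each is similar and cannot be extended by a row or a column without exceeding the threshold.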

Similar resources

Three Interrelated FCA Methods for Mining Biclusters of Similar Values on Columns

Biclustering numerical data tables consists in detecting particular and strong associations between subsets of objects and subsets of attributes. Such biclusters are interesting since they model the data as local patterns. Whereas several definitions of biclusters exist, depending on the constraints they should respect, we focus in this paper on biclusters of similar values on columns. There a...

Extraction de biclusters à valeurs similaires avec l’analyse de concepts triadiques

Biclustering numerical data became a popular data-mining task in the early 2000s, especially for analysing gene expression data. A bicluster reflects a strong association between a subset of objects and a subset of attributes in a numerical object/attribute data table. So-called biclusters of similar values can be thought of as maximal sub-tables with close values. Only a few methods address ...

DNA Microarray Data Analysis: A Novel Biclustering Algorithm Approach

Biclustering algorithms refer to a distinct class of clustering algorithms that perform simultaneous row-column clustering. Biclustering problems arise in DNA microarray data analysis, collaborative filtering, market research, information retrieval, text mining, electoral trends, exchange analysis, and so forth. When dealing with DNA microarray experimental data, for example, the goal of bicluste...

Enumerating all maximal biclusters in numerical datasets

Biclustering has proved to be a powerful data analysis technique due to its wide success in various application domains. However, the existing literature presents efficient solutions only for enumerating maximal biclusters with constant values, or heuristic-based approaches which cannot find all biclusters or even guarantee the maximality of the obtained biclusters. Here, we present a general fa...

Efficient mining of maximal biclusters in mixed-attribute datasets

This paper presents a novel enumerative biclustering algorithm to directly mine all maximal biclusters in mixed-attribute datasets, with or without missing values. The independent attributes are mixed or heterogeneous, in the sense that both numerical (real or integer values) and categorical (ordinal or nominal values) attribute types may appear together in the same dataset. The proposal is an ...

Publication date: 2011